General issues of study design and
analysis in the use of biomarkers in 
cancer epidemiology 

N. Pearce and P. Boffetta 

Other contributions to this volume have discussed sources of variation (see Vineis) and 
measurement error (see White). In this article, we focus on statistical issues involved in the 
design and analysis of epidemiological studies that use biomarkers. We do not consider 
statistical issues of laboratory analyses. 


In general, epidemiological research involves study¬ 
ing the external, modifiable causes of diseases in 
populations (McMichael, 1994, 1995) with the intention
of developing preventive interventions. In
some instances, this activity can be enhanced by 
using internal biomarkers to obtain better mea¬ 
surements of internal exposure (dose), disease or 
individual susceptibility. Statistical issues in studies 
using biomarkers of exposure are not markedly dif¬ 
ferent from those involved in other epidemiological 
studies based on measures of external exposure, 
while studies using biomarkers of disease pose spe¬ 
cific problems due to the lack of persistence of 
some of these markers, and the analysis of interaction
is of particular interest in studies using markers
of susceptibility. As with other epidemiological
studies, the statistical analysis of a study involving 
biomarkers involves in general: (1) relating a par¬ 
ticular disease (or health outcome, such as a marker 
of early effect) to (2) a particular exposure while (3) 
controlling for systematic error, (4) assessing inter¬ 
actions with other exposures and (5) assessing the 
possibility of random error. We will consider each 
of these five aspects of study design and analysis in 
turn. We will restrict our discussion to full-scale 
epidemiological studies, i.e. we assume that the use 
of biomarkers, be they of exposure, effect or suscep¬ 
tibility, aims to contribute to the elucidation of the 
causal relationships in human populations between 
diseases and factors such as external exposures, per¬ 
sonal habits, genetic traits and interventions. 
Issues in the design and analysis of transitional
studies, in which the main aim is the validation of the
markers themselves, are outside the scope of this
chapter.


Measuring disease with biomarkers 

Epidemiological studies are usually based on a par¬ 
ticular population followed over a particular period 
of time. Miettinen (1985) has termed this study 
population the 'base population' and its experience 
over time the 'study base'. The different epidemio¬ 
logical study designs differ only in the manner in 
which the study base is defined and the manner in 
which information is drawn from the study base 
(Checkoway et al., 1989). Thus, epidemiological 
studies may involve measuring either incidence or 
prevalence of disease. This distinction is important 
when a biomarker is being used to measure either 
the disease under study or early biological effects 
that are considered to be valid predictors of disease
risk (e.g. Rothman et al., 1995). In particular, many
studies measuring disease with biomarkers are of 
cross-sectional design and measure the prevalence 
of the disease, which is dependent on its incidence 
and its duration. Thus, in a study looking at markers 
of cell damage as an effect of exposure to known or 
suspected carcinogens, the results would depend on 
factors such as the turnover of the cells in which 
the marker is measured or the capacity to repair
the damage. For example, there is evidence that 
chromosomal aberrations caused by some carcino¬ 
gens, such as arsenic and benzene, last for longer 
periods than aberrations caused by vinyl chloride, 
and the reason for this difference is not known 
(Schwartz, 1990). 

Incidence studies 

Three measures of disease incidence are commonly 
used in incidence studies. The (person-time)
incidence rate (or incidence density; Miettinen,
1985) is a measure of the disease occurrence per
unit time. A second measure of disease occurrence 
is the cumulative incidence (incidence proportion; 
Miettinen, 1985) or risk, which is the proportion of 
study subjects who experience the outcome of in¬ 
terest at any time during the follow-up period. A 
third possible measure of disease occurrence is the 
incidence odds (Greenland, 1987), which is the ratio 
of the number of subjects who experience the out¬ 
come to the number who do not experience the 
outcome. As for the cumulative incidence, the in¬ 
cidence odds is dimensionless, but it is necessary to 
specify the time period over which it is being mea¬ 
sured. In incidence studies involving biomarkers 
of disease, it is therefore important to consider 
whether a particular biomarker measures incidence 
or cumulative incidence. 

Corresponding to these three measures of dis¬ 
ease occurrence, there are three principal ratio 
measures of effect that can be used in cohort stud¬ 
ies. The measure of primary interest is often the 
rate ratio (incidence density ratio), which is the 
ratio of the incidence rate in the exposed group to 
that in the non-exposed group. A second com¬ 
monly used effect measure is the risk ratio (cumu¬ 
lative incidence ratio), which is the ratio of the cu¬ 
mulative incidence in the exposed group to that in 
the non-exposed group. When the outcome is rare
over the follow-up period, the risk ratio is approx¬ 
imately equal to the rate ratio. A third possible 
effect measure is the incidence odds ratio, which is 
the ratio of the incidence odds in the exposed 
group to that in the non-exposed group. An anal¬ 
ogous approach can be used to calculate measures 
of effect based on the differences rather than the 
ratios, in particular the rate difference and the risk 
difference. 
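
The relationship between the three occurrence measures and the corresponding effect measures can be illustrated with a short calculation. The following is a minimal sketch in Python; the cohort counts and person-years are hypothetical values chosen for illustration, and the function name `cohort_measures` is our own.

```python
def cohort_measures(cases, n_at_start, person_years):
    """Return (incidence rate, cumulative incidence, incidence odds)."""
    rate = cases / person_years           # cases per person-year
    risk = cases / n_at_start             # proportion developing the outcome
    odds = cases / (n_at_start - cases)   # cases relative to non-cases
    return rate, risk, odds

# Hypothetical closed cohort followed for 10 years (illustrative numbers only).
exposed = cohort_measures(cases=30, n_at_start=1000, person_years=9850.0)
unexposed = cohort_measures(cases=15, n_at_start=1000, person_years=9925.0)

rate_ratio = exposed[0] / unexposed[0]
risk_ratio = exposed[1] / unexposed[1]
odds_ratio = exposed[2] / unexposed[2]

rate_difference = exposed[0] - unexposed[0]
risk_difference = exposed[1] - unexposed[1]

print(f"rate ratio {rate_ratio:.3f}, risk ratio {risk_ratio:.3f}, "
      f"odds ratio {odds_ratio:.3f}")
```

With an outcome this rare, the three ratio measures are close to one another (all near 2), which is the approximation referred to above; they diverge as the outcome becomes more common over the follow-up period.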

In incidence studies involving biomarkers of dis¬ 
ease, it is therefore important to consider whether 
a particular biomarker measures incidence or cu¬ 
mulative incidence. For example, if the 'disease' 
under study is hepatitis B virus (HBV) infection, 
then a survey of the prevalence of HBV markers, in 
a cohort that has been followed over time, will in¬ 
dicate the cumulative incidence of infection in the 
cohort (with the exception of those who have died 
from any cause during follow-up or who no longer 
show evidence of infection). It will not directly in¬ 
dicate the incidence rate of infections; this would 
require repeated prevalence surveys over time. 


Incidence case-control studies 
Incidence case-control studies involve studying all 
of the incident cases of disease generated by the 
study base and a control group sampled at random 
from the same study base. The relative risk measure 
is the incidence odds ratio; the effect measure that 
this estimates depends on the manner in which 
controls are selected. Once again, there are three
main options (Pearce, 1993). 

One option is to select controls from those who 
do not experience the outcome during the follow¬ 
up period, i.e. the survivors (those who did not 
develop the outcome at any time during the follow¬ 
up period). In this instance, a sample of controls 
chosen by cumulative incidence sampling will es¬ 
timate the exposure odds of the survivors, and the 
odds ratio obtained in the case-control study will 
therefore estimate the incidence odds ratio in the 
base population. Controls can also be sampled 
from the entire base population (those at risk at 
the beginning of follow-up), rather than just from 
the survivors (those at risk at the end of follow-up). 
In such case-base sampling, the controls will esti¬ 
mate the exposure odds in the base population of 
persons at risk at the start of follow-up, and the 
odds ratio obtained in the case-control study will
therefore estimate the risk ratio in the base popu¬ 
lation. The third approach is to select controls 
longitudinally throughout the course of the study 
(Miettinen, 1976); this is sometimes described as 
'risk-set sampling' (Robins et al., 1986), 'sampling
from the study base' (the person-time experience;
Miettinen, 1985) or 'density sampling' (Kleinbaum
et al., 1982). In this instance, the controls will estimate
the exposure odds in the study base, and the
odds ratio obtained in the case-control study will 
therefore estimate the rate ratio in the study base. 
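
The correspondence between the three control-sampling options and the three effect measures can be checked numerically. The sketch below works with expected values only (no sampling variability) and rests on assumptions that are ours rather than part of the discussion above: a closed cohort, constant incidence rates in each exposure group, no competing risks or losses to follow-up, and control selection that is independent of exposure. All numbers and the function name are hypothetical.

```python
import math

def expected_odds_ratios(n1, n0, rate1, rate0, years):
    """Expected case-control odds ratio under each control-sampling scheme."""
    risk1 = 1.0 - math.exp(-rate1 * years)
    risk0 = 1.0 - math.exp(-rate0 * years)
    cases1, cases0 = n1 * risk1, n0 * risk0
    survivors1, survivors0 = n1 - cases1, n0 - cases0
    # With a constant rate, expected cases = rate * person-time at risk.
    person_time1, person_time0 = cases1 / rate1, cases0 / rate0

    case_odds = cases1 / cases0  # exposure odds among the cases
    return {
        # controls drawn from survivors at the end of follow-up
        "cumulative": case_odds / (survivors1 / survivors0),
        # controls drawn from everyone at risk at the start of follow-up
        "case_base": case_odds / (n1 / n0),
        # controls drawn in proportion to person-time at risk
        "density": case_odds / (person_time1 / person_time0),
    }

ors = expected_odds_ratios(n1=2000, n0=8000, rate1=0.02, rate0=0.01, years=10)
true_risk_ratio = (1.0 - math.exp(-0.2)) / (1.0 - math.exp(-0.1))
print(ors, true_risk_ratio)
```

Under these assumptions the odds ratio from density sampling reproduces the rate ratio (2.0), case-base sampling reproduces the risk ratio (about 1.90) and cumulative incidence sampling reproduces the incidence odds ratio (about 2.11).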

In incidence case-control studies involving biomarkers
of disease, it is therefore important to consider
whether a particular biomarker measures
incidence or cumulative incidence. These issues 
determine not only which measure of effect is 
being estimated, but also which method of control 
selection is appropriate, and the resulting methods 
of data analysis. 

Prevalence studies 

The term prevalence denotes the number of cases 
of disease existing in the population at the time the 
study was conducted. If we denote the prevalence 
of a disease in the study population by P, and if we
assume that the incidence rate is constant over
time, the population has reached a 'steady state'
and there is no migration into or out of the prevalence
pool, then it can be shown (Rothman, 1986)
that the prevalence odds is equal to the incidence
rate (I) multiplied by the average disease duration
(D):

\[ \frac{P}{1 - P} = I \times D \]

Thus, the prevalence odds is directly proportional
to the disease incidence, and the prevalence
odds ratio is estimated to be:

\[ \mathrm{OR} = \frac{I_1 D_1}{I_0 D_0} \]

where the subscripts 1 and 0 denote the exposed and
non-exposed groups, respectively.

An increased prevalence odds ratio may thus re¬ 
flect the influence of factors that increase the dis¬ 
ease incidence and/or factors that increase disease 
duration. The different mechanisms involved in 
increasing disease incidence or disease duration are 
likely to involve different time patterns of exposure
and disease (see below), which in turn are
likely to require different biomarkers for measurement
of the etiologically relevant exposures.
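
The ambiguity of an increased prevalence odds ratio can be made concrete with a small calculation. This is a sketch only, using the steady-state relationship given above; the incidence rates and durations are hypothetical and the function names are our own.

```python
def prevalence_odds(incidence_rate, mean_duration):
    """Steady-state prevalence odds, P / (1 - P) = I x D."""
    return incidence_rate * mean_duration

def prevalence_from_odds(odds):
    """Convert prevalence odds back to a prevalence proportion."""
    return odds / (1.0 + odds)

def prevalence_odds_ratio(i1, d1, i0, d0):
    """Prevalence odds ratio, exposed (1) versus non-exposed (0)."""
    return prevalence_odds(i1, d1) / prevalence_odds(i0, d0)

# Scenario A: exposure doubles incidence, duration unchanged (5 years).
por_incidence = prevalence_odds_ratio(i1=0.02, d1=5.0, i0=0.01, d0=5.0)

# Scenario B: exposure doubles duration, incidence unchanged.
por_duration = prevalence_odds_ratio(i1=0.01, d1=10.0, i0=0.01, d0=5.0)

print(por_incidence, por_duration)
print(prevalence_from_odds(prevalence_odds(0.02, 5.0)))
```

Both scenarios give a prevalence odds ratio of about 2, so a prevalence study alone cannot distinguish an effect on incidence from an effect on duration.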

Prevalence case-control studies 
Just as an incidence case-control study can be used 
to obtain the same findings as a full cohort study,
a prevalence case-control study can be used to ob¬ 
tain the same findings as a full prevalence study in 
a more efficient manner. In particular, if obtaining 
exposure information is difficult or costly (e.g. if it 
involves serum samples), then it may be more effi¬ 
cient to conduct a prevalence case-control study 
by obtaining exposure information on all of the 
prevalent cases of the disease under study and a
sample of controls selected at random from the
non-cases. In this instance, a sample of controls
will estimate the exposure odds of the non-cases,
and the odds ratio obtained in the prevalence
case-control study will therefore estimate the prevalence
odds ratio in the base population, which in
turn estimates the incidence rate ratio, provided
that the average duration of disease is the same in
the exposed and non-exposed groups. Once again,
an increased prevalence odds ratio may reflect the
influence of factors that increase disease incidence
and/or factors that increase disease duration, and
the different mechanisms involved are likely to require
different biomarkers for measurement of the
etiologically relevant exposures.

Measuring exposure with biomarkers 

Validity of biomarkers of exposure 
There are considerable shortcomings in many cur¬ 
rently available biomarkers of exposure, including 
problems of measuring historical exposures; un¬ 
certainties as to what a biomarker is measuring; 
greater susceptibility to confounding in some in¬ 
stances; problems of application to public health 
policy (Pearce et al., 1995); the disease process 
affecting the level of the biomarker; and problems 
of validity of laboratory measurements (Boffetta, 
1995). These issues are covered elsewhere in this
volume. In this section we concentrate on issues of
study design and analysis when measuring expo¬ 
sure with biomarkers, particularly with regard to 
time-related exposures with a relatively long in¬ 
duction and latency period between exposure and 
the subsequent occurrence of disease (as is the sit¬ 
uation in most cancer epidemiology studies). The 
issues we discuss are not unique to a particular 
study design (cohort studies, case-control studies, 
cross-sectional studies) but rather apply to all stud¬ 
ies in which the etiologically relevant time period 
involves a relatively long induction and latency 
period, thereby posing problems with the mea¬ 
surement of exposure during this period. 

Time-related exposures 

Some biomarkers measure factors that are fixed and 
do not change over time in an individual, e.g.
susceptibility genes that may interact with xenobiotic
factors in cancer causation (Rothman, 1995).
Other biomarkers measure factors that change over
time, e.g. micronutrient levels in serum may change
from day to day (Willett, 1990).

In studies (both prospective and retrospective) 
of long-term health effects involving time-related 
exposures, it is important that the time patterns of 
the study exposure and of the relevant confounders
should be taken into account in the
analysis (Pearce et al., 1986). In particular, it is important
that the principal exposure under study
should be analysed in a time-related manner, 
taking account of the likely induction and latency 
periods, and the relative etiological importance 
of exposure intensity, exposure duration and 
cumulative exposure. The simplest approach is to
analyse the cumulative exposure in a time-related 
manner, and this may suffice when the aim is 
merely to consider whether or not there is an effect 
of exposure. However, once it has been provision¬ 
ally assumed that an effect exists, attention then 
shifts to understanding the nature of the effect. In 
this context, the temporal pattern of exposure and 
outcome can be considered by examining the 
effects of exposures in specific time windows while 
controlling for time-related confounders and for 
the effects of exposures in other time windows. A 
more sophisticated approach is direct fitting of a 
theoretical model of carcinogenesis (Pearce, 1992), 
which requires assumptions as to the relevance of 
the times in which the exposure occurred and in 
which the marker was measured. This may be par¬ 
ticularly relevant in studies including the mea¬ 
surement of the exposure, the marker and the dis¬ 
ease, which are mainly aimed at elucidating the 
role of the marker in the exposure-disease relationship
(Schatzkin et al., 1993). An example of this
type of study is the investigation of the role of human 
papilloma virus (HPV) in the etiology of cervical 
cancer (the 'exposure' in this case being factors 
such as number of sexual partners and age at first 
intercourse). In this case, the association with HPV 
infection, when properly measured with PCR-based 
assays, is of such a magnitude that there is little
concern about the relevance of the time periods 
(Munoz et al., 1992). However, for cancers from 
other organs, the association with HPV infection is 
less clear-cut, and the relevance of the timing of 
the infection may be one of the unknown factors 
modifying the exposure-marker-disease relation¬ 
ship. 

Thus, biological measures of time-related exposures
must be able to measure changes in exposure
levels over time. In particular, stored biological
samples may not provide valid measurements
of long-term patterns of exposure when there are 
significant variations in exposure over time, unless 
samples have been taken repeatedly over the 
course of the study (Armstrong et al., 1992). If it is 
not possible to take repeated biological samples over 
time, then it is essential that the samples that are 
taken relate to the etiologically relevant time period. 

Many currently available biomarkers only indi¬ 
cate relatively recent exposures. For example, it is 
well known that serum levels of micronutrients reflect
recent rather than historical dietary intake
(Willett, 1990); given the long induction time of 
most cancers it is usually exposures between 10 
and 30 years previously that are etiologically rele¬ 
vant. While this may not be a limitation in cross- 
sectional studies, provided that the etiologically 
relevant time period is close to the time of data col¬ 
lection, it is an important limitation in cohort and 
case-control studies aiming to assess the effects of 
historical exposures. Some biomarkers are better 
than others in this respect (particularly biomarkers 
for exposure to biological agents), but even the 
best markers of chemical exposures reflect only the 
last few weeks or months of exposure. On the other 
hand, with some biomarkers (e.g. serum levels of 
TCDD; Johnson et al., 1992) it may be possible to 
estimate historical levels if the exposure period is 
known, if the half-life is relatively long (and is 
known) and if it is assumed that no significant 
exposure has occurred more recently, or if it is 
reasonable to assume that exposure levels have re¬ 
mained stable over time. 
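
Where the conditions just listed hold, back-extrapolation of a historical level is a simple calculation. The sketch below assumes first-order (single-compartment) elimination, which is our simplifying assumption and not a statement about the kinetics of any particular compound; the measured level, elapsed time and half-life are hypothetical inputs, and the function name is our own.

```python
def back_extrapolate(current_level, years_since_exposure, half_life_years):
    """Estimate the level at the end of exposure from a current measurement.

    Assumes first-order elimination and no further exposure since then.
    """
    elapsed_half_lives = years_since_exposure / half_life_years
    return current_level * 2.0 ** elapsed_half_lives

# Hypothetical example: level of 50 (arbitrary units) measured 14 years
# after exposure ceased, with an assumed half-life of 7 years.
estimated_past_level = back_extrapolate(50.0, 14.0, 7.0)
print(estimated_past_level)  # 200.0

# Sensitivity of the estimate to the assumed half-life.
for half_life in (5.0, 7.0, 10.0):
    print(half_life, round(back_extrapolate(50.0, 14.0, half_life), 1))
```

The sensitivity loop illustrates why the half-life must be known with reasonable precision: modest changes in the assumed half-life produce large changes in the estimated historical level.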

However, historical information on exposure 
surrogates will often be more valid than current 
direct measurements of exposure or dose. This 
situation has long been recognized in occupational 
epidemiology, where the use of work history 
records in combination with a job-exposure matrix 
(based on historical exposure measurements of work 
areas rather than individuals) is usually more valid 
than current exposure measurements (whether based 
on environmental measurements or biomarkers) 
because of changes in exposure levels over time 
(Checkoway et al., 1989). Similar problems may
occur in the measurement of other carcinogenic 
exposures. For example, even the best currently 
available measures of exposure to tobacco smoke, 
such as plasma or urinary cotinine, appear to have 
similar validity to questionnaires for the measure¬ 
ment of current exposures; their very short half- 
life makes them inferior to questionnaires in the 
estimation of historical exposures (Pearce et al., 
1995). On the other hand, some biomarkers would 
appear to have value in the validation of questionnaires
(Forastiere et al., 1993), which can then be
used to estimate historical exposures. 

Another example in which timing of sample 
collection is of great importance is in the case of 
DNA adducts (Wilcosky & Griffith, 1990). Since
most adducts are readily repaired, any measure of 
exposure based on DNA adducts will depend on
the time between the end of exposure and sample 
collection; this pattern can then be modified by 
factors such as the activity of repair enzymes, 
which in turn may also have an independent influence
on the outcome, i.e. may be associated
with case-control status. DNA adduct formation 
and repair are particularly problematic, since: 

• the extent to which the amount of adducts 
measured represents the amount of biologically 
active adducts formed (as discussed above) is 
usually not known; and 

• in the case of measurements taken during ex¬ 
posure (or immediately thereafter, such as at the 
end of a working day) it is not known how 
much of the adducts found would persist long 
enough to be biologically important. 

Analysis based on pooled samples 
These issues of the timing of sample collection are 
of particular concern in analyses based on pooled 
samples. If the samples were not all taken at the 
same etiologically relevant time period, then the 
pooled sample will represent the average of sam¬ 
ples taken from a variety of time periods. Thus, 
there may be considerable misclassification with 
regard to the exposure levels at the etiologically 
relevant time period. One solution is to pool
small sets of samples stratified on the basis of time
since collection; however, this procedure may substantially
reduce the advantages of pooling.

Systematic error

The major possible types of systematic error (bias)
are the same in traditional epidemiology and in
studies involving biomarkers (Boffetta, 1995). The
various types of bias can be grouped into three
major classes: selection bias, information bias, and
confounding (Rothman, 1986). This section is not
intended to give a comprehensive review of these
types of bias; rather, we will concentrate on issues
involving data analysis.

Selection bias 

Selection bias involves biases arising from the pro¬ 
cedures by which the study participants are chosen 
from the study base. Selection bias can be avoided 
by including all of the study base (i.e. a cohort 
study) and obtaining a response rate of 100%. This
is often not practicable, but selection bias can also
be controlled in the analysis by identifying factors 
that are related to subject selection and controlling 
for them as confounders. The statistical issues in¬ 
volved in controlling for selection bias in the 
analysis are essentially the same as those involved 
in controlling for sources of confounding (see 
below). 

Information bias 

Information bias involves misclassification of the 
study participants with respect to disease or expo¬ 
sure status. Thus, the concept of information bias 
refers to those people actually included in the 
study, whereas selection bias refers to the selection 
of the study participants from the study base, and 
confounding generally refers to non-comparabil¬ 
ity of subgroups within the study base. The various 
methodological issues of validity, reproducibility 
and stability of markers are part of the more gen¬ 
eral problem of information bias. 

Non-differential information bias occurs when 
the likelihood of misclassification of exposure is 
the same for both cases and non-cases of disease 
(or when the likelihood of misclassification of disease
is the same for exposed and non-exposed persons).
Non-differential misclassification of exposure
generally biases the relative risk estimate towards
the null value of 1.0 (Copeland et al., 1977). Hence,
non-differential information bias tends to produce 
'false negative' findings and is of particular con¬ 
cern in studies that find no association between 
exposure and disease. 
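
The attenuation produced by non-differential misclassification of a dichotomous exposure can be shown with expected counts. This is a minimal sketch: the true case and control counts, the sensitivity and the specificity are hypothetical, and the same sensitivity and specificity are applied to cases and controls (which is what makes the misclassification non-differential).

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts after misclassification."""
    observed_exposed = sensitivity * exposed + (1.0 - specificity) * unexposed
    observed_unexposed = (1.0 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed

def odds_ratio(case_exp, case_unexp, control_exp, control_unexp):
    return (case_exp * control_unexp) / (case_unexp * control_exp)

# Hypothetical true exposure distribution.
true_cases = (200.0, 300.0)      # exposed, unexposed
true_controls = (100.0, 400.0)
true_or = odds_ratio(*true_cases, *true_controls)

# Same measurement error in cases and controls.
observed_cases = misclassify(*true_cases, sensitivity=0.8, specificity=0.9)
observed_controls = misclassify(*true_controls, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(*observed_cases, *observed_controls)

print(round(true_or, 2), round(observed_or, 2))  # 2.67 1.94
```

The observed odds ratio lies between the null value and the true value, so an imperfect marker applied identically to both groups understates the association in this two-category setting.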

Differential information bias occurs when the 
likelihood of misclassification of exposure is differ¬ 
ent between cases and non-cases (or the likelihood 
of misclassification of disease is different between 
exposed and non-exposed persons). This can bias the 
observed effect estimate in either direction, either 
towards or away from the null value. 

Information bias can drastically affect the validity 
of a study. As a general principle, it is important to 
ensure that the misclassification is non-differential, 
by ensuring that exposure information is collected 
in an identical manner in cases and non-cases (or 
that disease information is collected in an identical 
manner in the exposed and non-exposed groups). 
In this situation, the bias is in a known direction 
(towards the null), and although there may be 
concern that not finding a significant association 
(between exposure and disease) may be due to
non-differential information bias, at least one can 
be confident that any positive findings are not due 
to information bias. Thus, the aim of data collection 
is not to collect perfect information, but to collect 
comparable information in a similar manner from 
the groups being compared, even if this means 
ignoring more detailed exposure information if 
this is not available for both groups. However, it is
clearly important to collect information that is as 
detailed and accurate as possible, within the con¬ 
straints imposed by the need to ensure that infor¬ 
mation is collected in a similar manner in the 
groups being compared. 

In general, cross-sectional and case-control 
studies based on biomarkers of exposure are more 
prone to differential misclassification than studies 
based on measurement of external exposure and 
disease, since the biomarkers may be influenced by
the disease itself. This problem is less relevant in 
prospective studies (or nested case-control studies) 
in which the marker is measured on biological 
material collected before the onset of disease, pro¬ 
vided that cases diagnosed within a short interval 
after sample collection are excluded. The fact that 
the relationships between exposure, marker and 
disease are in most cases obscure limits the inter¬ 
pretation of the findings of biomarker-based studies 
with respect to the presence or absence of infor¬ 
mation bias. 

Confounding 

Confounding occurs when the exposed and non- 
exposed groups (in the study base) are not compa¬ 
rable due to inherent differences in background 
disease risk (Greenland & Robins, 1986) caused by 
exposure to other risk factors. The concept of con¬ 
founding thus generally refers to the study base, 
although as noted above, confounding can also be 
introduced (or removed) by the manner in which 
study participants are selected from the study base. 

If no other biases are present, three conditions 
are necessary for a factor to be a confounder 
(Rothman, 1986). First, a confounder is a factor 
that is predictive of disease (in the absence of the 
exposure under study); second, a confounder is 
associated with exposure in the study base; and 
third, a variable that is intermediate in the causal 
pathway between exposure and disease is not a 
confounder. This latter issue is of particular concern
in studies using biomarkers, since the identification
of potential confounders depends on previous
knowledge of the relationship between these
and the relevant variables of exposure and outcome,
and such knowledge is, for most biomarkers,
very poor.

The most straightforward method of control¬ 
ling confounding in the analysis involves stratify¬ 
ing the data into subgroups according to the levels 
of the confounder(s) and calculating a summary 
effect estimate that summarizes the information 
across strata. However, it is usually not possible to 
control simultaneously for more than two or three 
confounders when using stratified analysis. This 
problem can be mitigated to some extent by the 
use of mathematical modelling, but this may in 
turn produce problems of multicollinearity when 
variables which are highly correlated are entered 
simultaneously into the model. For example, 
serum levels of various micronutrients may be 
strongly correlated, and multicollinearity may 
occur when they are included in the same model. 
This will lead to unstable effect estimates with 
large standard errors, and may in fact lead to the 
'wrong' micronutrient showing the strongest asso¬ 
ciation (negative or positive) with disease. This 
may be one reason why numerous studies have 
shown that the consumption of green and yellow 
vegetables is protective against a range of cancers, 
but the identification of the specific dietary micro¬ 
nutrients involved has remained elusive (Steinmetz 
& Potter, 1991).
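
A stratified analysis with a summary effect estimate can be sketched as follows. We use the Mantel-Haenszel summary odds ratio as one common choice of summary estimator; it is our choice for illustration and is not named in the discussion above. The two strata (for example, non-smokers and smokers) and all counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed and unexposed cases; c, d = exposed and unexposed controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across (a, b, c, d) strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical data: the confounder is associated with both exposure and disease.
strata = [
    (10, 20, 100, 400),   # stratum 1, e.g. non-smokers
    (80, 20, 200, 100),   # stratum 2, e.g. smokers
]

# Crude table obtained by ignoring the strata.
crude = odds_ratio(*[sum(column) for column in zip(*strata)])
stratum_ors = [odds_ratio(*s) for s in strata]
summary = mantel_haenszel_or(strata)

print(crude, stratum_ors, summary)  # 3.75 [2.0, 2.0] 2.0
```

In this constructed example the crude odds ratio (3.75) overstates the association, whereas the stratum-specific and summary estimates agree at 2.0. With more than two or three confounders the strata become too sparse for this approach, which is the limitation noted above.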

In general, control of confounding requires 
careful use of a priori knowledge, together with 
assessment of the extent to which the effect esti¬ 
mate changes when the factor is controlled in the 
analysis. Most epidemiologists prefer to make a 
decision based on the latter criterion, although it 
can be misleading, particularly if misclassification 
is occurring (Greenland & Robins, 1985). The
decision to control for a presumed confounder can 
certainly be made with more confidence if there is 
supporting prior knowledge that the factor is pre¬ 
dictive of disease. 

Misclassification of a confounder leads to a loss 
of ability to control confounding, although control
may still be useful provided that misclassification
of the confounder was unbiased (Greenland,
1980). Misclassification of exposure is more problematic,
since factors that influence misclassification
may appear to be confounders, but control of
these factors may increase the net bias (Greenland 
& Robins, 1985). 

When appropriate information is not available 
to control confounding directly, it is still desirable 
to assess its potential strength and direction. For 
example, it may be possible to obtain information 
on a surrogate for the confounder of interest. Even 
though confounder control will be imperfect in this 
situation, it is still possible to examine whether the 
main effect estimate changes when the surrogate is 
controlled in the analysis, and to assess the 
strength and direction of the change. Alternatively,
it may be possible to obtain accurate confounder 
information for a subgroup of participants (cases 
and non-cases) in the study and to assess the effects 
of confounder control in this subgroup. 

A related approach involves obtaining con¬ 
founder information for a sample of the study base 
(or a sample of the controls in a case-control 
study). For example, in a study based on question¬ 
naires, biomarkers may be used to validate 
questionnaire information in a subgroup of study 
participants. 

The potential for confounding is of major con¬ 
cern in all epidemiological studies, including those 
using biomarkers. The use of biomarkers of exposure 
does not reduce the need to control for confound¬ 
ing, and in some instances the use of biomarkers 
may actually introduce confounding into a study. 
For example, in a study of lung cancer and PAH ex¬ 
posure in a group of factory workers, if the workers 
are classified according to PAH exposure on the 
basis of industrial hygiene monitoring, then the 
percentage of smokers (and the mean number of 
cigarettes smoked per day) will usually be similar 
in the groups with low, medium and high levels of 
occupational exposure to PAH (since these groups 
are defined purely on environmental levels of PAH 
exposure in various job categories and depart¬ 
ments, and these exposures will usually be unrelated 
to cigarette smoking). In this situation, cigarette 
smoking will not be a major confounder. However, 
if workers are classified according to PAH-DNA 
levels, these will indicate total exposure to PAHs 
from all sources, including cigarette smoking.
Thus, cigarette smokers will be more likely to be in 
the 'high PAH exposure' group, and this group will 
therefore contain a higher proportion of smokers 
(and a higher mean number of cigarettes smoked
per day) than the medium or low PAH exposure
group (since some of the total PAH exposure comes 
from cigarette smoke). The dose-response will 
then be confounded by cigarette smoking, and the 
'high PAH exposure' group may show a higher 
lung cancer risk which is not due to PAH exposure,
but which is actually due to the other carcinogenic
constituents of cigarette smoke (Pearce et al.,
1995). One solution is to stratify the analysis on
cigarette smoking (as measured by questionnaire), 
but any confounding control is likely to be imper¬ 
fect. This is even more of a problem if biomarkers 
are used to measure tobacco smoking because of 
the problems of measuring the etiologically relevant
constituents of tobacco smoke (as distinct
from exposure to PAHs in tobacco smoke). Once 
again, confounding control is likely to be imper¬ 
fect and, therefore, to yield results that are still 
confounded and less valid than those obtained by 
only considering occupational exposures (using a 
job-exposure matrix). Furthermore, one can only 
control for known confounders (e.g. tobacco smoke) 
and cannot control for unknown confounders that 
may also be subject to the same types of biases 
described above. Therefore, it is usually preferable 
to avoid confounding rather than to attempt to 
control for it post hoc (which is why randomized 
trials are the preferred method when they are fea¬ 
sible). Thus, it is preferable to consider only occu¬ 
pational exposures to PAHs, using a job-exposure 
matrix, and not to attempt to measure non-occu- 
pational exposures to PAH using biomarkers. 
Finally, it should be stressed that the above issues 
have been discussed in terms of studies in which 
exposure is measured prospectively; the problems 
are even more acute when historical exposures are 
being assessed. 
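
The confounding mechanism described above can be made concrete with a small expected-value calculation. The following is a minimal sketch under purely hypothetical assumptions: the prevalences, the risks and the probabilities of a 'high' PAH-DNA level are invented for illustration and are not taken from any study. PAH exposure is deliberately given no effect on lung cancer, so that any association found with the biomarker classification is due to smoking alone.

```python
# Hypothetical illustration: every number below is an assumption, not data.
from itertools import product

P_SMOKER = 0.3         # assumed smoking prevalence
P_OCCUPATIONAL = 0.5   # assumed occupational PAH exposure, independent of smoking
BASELINE_RISK = 0.001  # assumed lung cancer risk in non-smokers
SMOKING_RR = 10.0      # assumed effect of the non-PAH constituents of smoke
PAH_RR = 1.0           # PAH itself is assumed to have no effect (null scenario)

# Assumed probability of a 'high' PAH-DNA level given (occupational, smoker).
P_HIGH_ADDUCT = {(0, 0): 0.1, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 0.9}


def cells():
    """Yield (occupational, smoker, population fraction, risk) for the four groups."""
    for occ, smk in product((0, 1), repeat=2):
        frac = (P_OCCUPATIONAL if occ else 1 - P_OCCUPATIONAL) * (
            P_SMOKER if smk else 1 - P_SMOKER
        )
        risk = BASELINE_RISK * (SMOKING_RR if smk else 1.0) * (PAH_RR if occ else 1.0)
        yield occ, smk, frac, risk


def by_job(occ, smk):
    """Classification by job-exposure matrix: depends on occupation only."""
    return float(occ)


def by_adduct(occ, smk):
    """Classification by biomarker: reflects PAH from all sources, including smoke."""
    return P_HIGH_ADDUCT[(occ, smk)]


def risk_ratio(prob_exposed, smokers=None):
    """Expected risk ratio, 'exposed' versus 'unexposed'.

    prob_exposed(occ, smk) is the probability of being classified as exposed.
    If smokers is 0 or 1, the calculation is restricted to that smoking stratum.
    """
    exp_n = exp_cases = unexp_n = unexp_cases = 0.0
    for occ, smk, frac, risk in cells():
        if smokers is not None and smk != smokers:
            continue
        w = prob_exposed(occ, smk)
        exp_n += frac * w
        exp_cases += frac * w * risk
        unexp_n += frac * (1 - w)
        unexp_cases += frac * (1 - w) * risk
    return (exp_cases / exp_n) / (unexp_cases / unexp_n)


print(f"job-exposure matrix, crude RR : {risk_ratio(by_job):.2f}")
print(f"PAH-DNA level, crude RR       : {risk_ratio(by_adduct):.2f}")
print(f"PAH-DNA level, non-smokers    : {risk_ratio(by_adduct, smokers=0):.2f}")
print(f"PAH-DNA level, smokers        : {risk_ratio(by_adduct, smokers=1):.2f}")
```

Under these assumptions the job-based classification shows no association, whereas the biomarker-based classification shows a crude risk ratio well above 2 that disappears within smoking strata. The stratified result is exact here only because smoking is assumed to be measured without error; in practice, as noted above, such control is likely to be imperfect.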

Random error 

Random error will occur in any epidemiological 
study, just as it occurs in experimental studies. It is 
often referred to as 'chance', although it can per¬ 
haps more reasonably be regarded as 'ignorance' 
(Checkoway et al., 1989). Even in an experimental 
study in which participants are randomized into 
'exposed' and 'non-exposed' groups, there will be 
'random' differences in background risk between 
the compared groups, but these will diminish in 
importance (i.e. the random differences will 'even 
out') as the study size grows. In epidemiological 
studies, there is no guarantee that differences in 
baseline (background) risk will even out between 
the exposure groups as the study size grows, but it 
is necessary to make this assumption in order to 
proceed with the study (Greenland & Robins, 
1986). In practice, the study size depends on the 
number of available participants and the available 
resources. Within these limitations, it is desirable 
to make the study as large as possible, taking into 
account the trade-off between including more par¬ 
ticipants and gathering more detailed information 
about a smaller number of participants (Greenland, 
1988). 

A major problem with the use of biomarkers of 
exposure and outcome in cancer epidemiology 
studies (and particularly in cohort studies) is that 
of small numbers. Even large multicentre cohort 
studies often struggle to obtain sufficient numbers 
to assess risks of rare cancers from occupational exposures 
(Fingerhut et al., 1991; Saracci et al., 1991). 
The use of biomarkers may be a major problem in 
this regard, since the resulting expense and com¬ 
plexity may drastically reduce the study size, even in 
community-based and nested case-control studies, 
and therefore greatly reduce the statistical power 
for detecting an association between exposure and 
disease. As in traditional epidemiology studies, in 
studies using biomarkers statistical power is a func¬ 
tion of the prevalence of exposure and the magni¬ 
tude of risk; a biomarker with low prevalence and 
high relative risk can be evaluated in small popu¬ 
lations, whereas a biomarker with low prevalence 
and low relative risk requires a larger population. 
The optimal balance between precision and valid¬ 
ity depends on a number of considerations, in¬ 
cluding the relative costs of the various exposure 
measurement techniques (Greenland, 1988). Thus, 
for an expensive biomarker to be useful it must be 
substantially better than less expensive and less in¬ 
vasive approaches. However, it has been argued 
that the necessary study size in some molecular 
epidemiology studies may be smaller than in tra¬ 
ditional epidemiology (Hulka, 1990a,b; Hertzberg 
& Russek-Cohen, 1993) because of larger differ¬ 
ences in biomarker distribution, identification of 
subgroups at higher risk, and the use of continuous 
outcome variables (Boffetta, 1995). While this is 
true in many cases (e.g. the detection of mutations 
in critical genes as a marker of increased risk of 
cancer), in other cases the biomarkers may show a 
very low (or very high) prevalence, thus requiring 
large samples to detect a difference between groups 
(Hulka & Margolin, 1992; Rothman et al., 1995). 
Finally, it should be noted that some biomarkers 
are of interest in themselves, rather than function¬ 
ing as surrogates for other exposures; in particular, 
no alternative methods exist for measuring mark¬ 
ers of genetic susceptibility. Nevertheless, the ad¬ 
ditional information provided by the use of such 
markers should still be compared with that pro¬ 
vided by alternative, larger studies in which the 
marker is not used. 
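
The dependence of study size on marker prevalence and on the magnitude of risk can be illustrated with a standard normal-approximation formula for an unmatched case-control study with one control per case. This is a minimal sketch, not a substitute for a formal power calculation; the function name `cases_needed` and the example prevalences and odds ratios are assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cases_needed(p0, odds_ratio, alpha=0.05, power=0.80):
    """Approximate number of cases (one control per case) needed to detect
    `odds_ratio` for a binary marker with prevalence `p0` among controls,
    using a two-sided test at level `alpha`."""
    # Marker prevalence among cases implied by the odds ratio.
    p1 = p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


# Hypothetical scenarios: (prevalence among controls, odds ratio).
for p0, odds_ratio in [(0.05, 5.0), (0.05, 1.5), (0.30, 2.0), (0.01, 2.0)]:
    n = cases_needed(p0, odds_ratio)
    print(f"prevalence {p0:.2f}, OR {odds_ratio}: about {n} cases")
```

With these inputs, a rare marker with a high odds ratio needs fewer than a hundred cases, while the same prevalence with a modest odds ratio needs well over a thousand, and a very low prevalence inflates the requirement even for a moderate odds ratio.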

An additional consideration in study size estimation 
in studies using biomarkers is the ratio of 
the number of assays per individual to the number 
of individuals in the study (Boffetta, 1995), 
for example in studies based on the detection of 
chromosomal aberrations or sister chromatid exchanges. 
In this case, studies 
must be based on adequate numbers of individuals 
and observations per individual (Hirsch et al., 
1984; Whorton, 1985). Many biomarkers show 
marked variation from day to day within the same 
individual, and the intra-individual variation may 
be greater than the interindividual variation 
(Armstrong et al., 1992). It may therefore be necessary 
to take a large number of measurements to 
accurately estimate the average exposure level for 
each individual; otherwise it may be impossible to 
detect differences between individuals. For exam¬ 
ple, for 24-hour urinary sodium, the within-person 
variation may be three times as high as the be- 
tween-person variation; it has been estimated that 
the misclassification resulting from taking only 
one sample per person would result in a true relative 
risk of 2.0 being reduced to an observed relative 
risk of 1.2 (Armstrong et al., 1992). Thus, it may 
be necessary to take 10-15 24-hour urine samples 
in order to achieve reasonable accuracy in estimat¬ 
ing average individual sodium intake levels. 
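
The attenuation in the sodium example can be reproduced with a commonly used approximation for classical (random, non-differential) measurement error, in which the observed relative risk equals the true relative risk raised to the power of the reliability coefficient. The sketch below assumes that 'three times as high' refers to the ratio of within-person to between-person variance and that the dose-response is log-linear; it is not claimed to be the calculation performed by Armstrong et al. The function name is hypothetical.

```python
def observed_relative_risk(true_rr, variance_ratio, n_samples=1):
    """Attenuated relative risk when exposure is the mean of `n_samples` replicates.

    variance_ratio: within-person variance divided by between-person variance.
    Assumes classical random error and a log-linear dose-response.
    """
    # Reliability of the mean of n_samples independent replicate measurements.
    reliability = 1.0 / (1.0 + variance_ratio / n_samples)
    return true_rr ** reliability


for n_samples in (1, 5, 10, 15):
    rr = observed_relative_risk(2.0, 3.0, n_samples)
    print(f"{n_samples:2d} sample(s) per person: observed RR about {rr:.2f}")
```

Under these assumptions a single sample gives an observed relative risk of about 1.2, in line with the figure quoted above, and averaging 10 to 15 samples recovers most, though not all, of the true relative risk of 2.0.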

Interaction 

Interaction (effect modification) occurs when the 
estimate of effect of exposure depends on the level 
of another factor in the study base (Miettinen, 
1974). The term statistical interaction denotes a 
similar phenomenon in the observed data. The for¬ 
mer term will generally be used here. Interaction is 
distinct from confounding (or selection or infor¬ 
mation bias) in that it does not represent a bias 
which should be removed or controlled, but rather 
a real difference in the effect of exposure in various 
subgroups that may be of considerable interest. For 
example, in a cohort study of passive smoking and 
asthma in children, the effect estimate for 
passive smoking might be different in different age 
groups, or in males and females. The clearest 
example of interaction is when a factor is actually 
hazardous in one group and protective in another 
group. More generally, the risk might be elevated 
in both groups, but the strength of the effect may 
vary. A typical example of effect modification in 
studies using biomarkers is the estimate of the risk 
of disease due to an external agent in subgroups of 
the population with a different genetic suscepti¬ 
bility marker, such as the polymorphism for an en¬ 
zyme implicated in the activation or detoxification 
of the agent (see Landi & Caporaso, this volume). 
In this situation, effect modification should be in¬ 
terpreted with considerable care, since the pres¬ 
ence of statistical interaction may depend on the 
statistical methods used. In fact, all secondary risk 
factors modify either the rate ratio or the rate dif¬ 
ference, as uniformity over one measure implies 
non-uniformity over the other. If the assessment of 
the joint effects of two factors is a fundamental 
goal of the study, this can be done by calculating 
stratum-specific effect estimates. It is less clear how 
to proceed if statistical interaction is occurring, but 
assessment of joint effects is not an analytical goal. 
Some authors (e.g. Kleinbaum et al., 1982) argue 
that it is not appropriate in this situation to calcu¬ 
late an overall estimate of effect summarized across 
levels of the effect modifier. However, it is com¬ 
mon to ignore this stipulation if the difference in 
effect estimates is not too great (Pearce, 1989). In 
fact, valid analytical methods (e.g. standardized 
rate ratios) have been specifically developed for 
this situation (Rothman, 1986). 
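
The statement that uniformity over one effect measure implies non-uniformity over the other can be checked with simple arithmetic. The following sketch uses hypothetical baseline rates in two strata of a secondary risk factor; the rates and function names are assumptions for illustration only.

```python
# Hypothetical rates per 100 000 person-years among the unexposed, in two
# strata of a secondary risk factor (for example, two age groups).
BASELINE_RATES = {"stratum A": 10.0, "stratum B": 50.0}


def exposed_rates_uniform_ratio(rate_ratio):
    """Exposed rates if exposure multiplies the rate by the same factor everywhere."""
    return {s: r0 * rate_ratio for s, r0 in BASELINE_RATES.items()}


def exposed_rates_uniform_difference(rate_difference):
    """Exposed rates if exposure adds the same amount to the rate everywhere."""
    return {s: r0 + rate_difference for s, r0 in BASELINE_RATES.items()}


def effect_measures(exposed_rates):
    """Return {stratum: (rate ratio, rate difference)}."""
    return {
        s: (exposed_rates[s] / r0, exposed_rates[s] - r0)
        for s, r0 in BASELINE_RATES.items()
    }


for label, exposed in [
    ("uniform rate ratio of 2", exposed_rates_uniform_ratio(2.0)),
    ("uniform rate difference of 10", exposed_rates_uniform_difference(10.0)),
]:
    print(label)
    for stratum, (ratio, difference) in effect_measures(exposed).items():
        print(f"  {stratum}: ratio {ratio:.1f}, difference {difference:.0f}")
```

When the rate ratio is 2 in both strata, the rate differences are 10 and 50; when the rate difference is 10 in both strata, the rate ratios are 2.0 and 1.2. Whether 'interaction' is seen therefore depends on the scale chosen, provided the secondary factor affects the baseline rate.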

Biomarkers may provide better opportunities 
for assessing interactions between two or more 
genetic and/or environmental factors. In particular, 
susceptibility genes should produce a 
higher disease risk for exposed susceptible groups 
than for non-susceptible and non-exposed groups 
(Boffetta, 1995). 

However, a major problem of testing for inter¬ 
action, e.g. in studies involving markers of genetic 
susceptibility, is that it usually requires a substantial 
increase in study size. For example, in a case-control 
study, testing for interaction involves comparing 
the sizes of the odds ratios (relating exposure 
and disease) in different strata of the effect modi¬ 
fier, rather than merely testing whether the overall 
odds ratio is different from the null value of 1.0. 
The power of the test for interaction therefore 
depends on the numbers of cases and controls in 
specific strata (of the effect modifier) rather than the 
overall numbers of cases and controls. For exam¬ 
ple, Smith and Day (1984) give an example of a 
case-control study that would have to be five times 
larger to detect a difference between odds ratios of 
1.0 and 2.0 in the two different strata of an effect 
modifier than it would have to be to detect an 
overall odds ratio of 2.0 (ignoring the effect modi¬ 
fier). In general, when considering possible inter¬ 
actions, the size of the study needs to be at least 
four times larger than when interaction is not con¬ 
sidered (Smith & Day, 1984). Thus, in a study in¬ 
volving markers of genetic susceptibility, the gain 
in statistical power from considering such markers 
(thereby yielding higher relative risks in some 
strata) may be offset by the decrease in statistical 
power from the need to consider interactions. 
However, if the exposure of interest is independent 
from the genetic factor under study, case-case 
comparisons can be used to study interactions 
with greater statistical power (Piegorsch et al., 
1984). 
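
The loss of power when testing for interaction can be seen from the variance of the log odds ratio. The sketch below is a rough Wald-test approximation based on expected cell counts (Woolf's variance) and is not the calculation of Smith and Day; it assumes two strata of equal size, the same exposure prevalence among controls in both strata, and equal numbers of cases and controls. The function names and the prevalence of 0.3 are hypothetical.

```python
def prevalence_in_cases(p0, odds_ratio):
    """Exposure prevalence among cases implied by prevalence p0 among controls."""
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))


def var_log_or(n_cases, n_controls, p0, odds_ratio):
    """Woolf variance of the log odds ratio, using expected cell counts."""
    p1 = prevalence_in_cases(p0, odds_ratio)
    return (
        1 / (n_cases * p1)
        + 1 / (n_cases * (1 - p1))
        + 1 / (n_controls * p0)
        + 1 / (n_controls * (1 - p0))
    )


def size_inflation(p0, or_stratum_1, or_stratum_2, or_overall, n=1000.0):
    """Ratio of the variance of the interaction contrast (difference of the two
    stratum-specific log odds ratios) to the variance of the overall log odds ratio.

    When log(or_stratum_2 / or_stratum_1) equals log(or_overall), the effect
    sizes on the log scale are equal, so this ratio approximates the factor by
    which the study must grow to keep the same power.
    """
    var_main = var_log_or(n, n, p0, or_overall)
    var_interaction = var_log_or(n / 2, n / 2, p0, or_stratum_1) + var_log_or(
        n / 2, n / 2, p0, or_stratum_2
    )
    return var_interaction / var_main


# Odds ratios of 1.0 and 2.0 in the two strata, versus an overall odds ratio of 2.0.
print(f"approximate size inflation: {size_inflation(0.3, 1.0, 2.0, 2.0):.1f}")
```

Each stratum contains half the participants, so each stratum-specific estimate has roughly twice the variance of the overall estimate, and their difference has roughly four times that variance. Under these assumptions the inflation factor is a little over four, consistent with the rule of thumb quoted above.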

Conclusions 

In some instances, the increasing use of biomarkers 
in epidemiological studies represents a major 
improvement in the discipline in recent years 
(Vineis, 1992). In many cases, biomarkers have 
helped to improve our knowledge of causes and 
mechanisms of both disease etiology and preven¬ 
tion. In other cases, however, it is unclear whether 
they represent an improvement on traditional epi¬ 
demiological methods. Epidemiological studies 
based on biomarkers are usually more complex 
than traditional epidemiological studies, because 
information is available on a larger number of vari¬ 
ables whose biological meaning is often poorly 
known. The methodological considerations in¬ 
volved in classical epidemiological studies on 
issues such as measurement of disease, measure¬ 
ment of exposure, selection bias, confounding, 
precision and interaction also apply to biomarker-based 
studies, and in most cases the methodological 
problems of this type of study do not require 
solutions different from those used in classical 
studies. In some cases, however, the use of bio- 
markers may pose specific problems, which have 
to be addressed within the general framework of 
modern epidemiological methods. 